Use of Affective Visual Information for Summarization of Human-Centric Videos

Abstract

The increasing volume of user-generated human-centric video content and its applications, such as video retrieval and browsing, require compact representations addressed by the video summarization literature. Current supervised studies formulate video summarization as a sequence-to-sequence learning problem, and existing solutions often neglect the surge of the human-centric view, which inherently contains affective content. In this study, we investigate the affective-information enriched supervised video summarization task for human-centric videos. First, we train a visual input-driven state-of-the-art continuous emotion recognition model (CER-NET) on the RECOLA dataset to estimate activation and valence attributes. Then, we integrate the estimated emotional attributes and their high-level embeddings from CER-NET with the visual information to define the proposed affective video summarization (AVSUM) architectures. In addition, we use temporal and spatial attention to improve the AVSUM architectures and propose two new architectures based on temporal attention (TA-AVSUM) and spatial attention (SA-AVSUM). We conduct video summarization experiments on the TvSum and COGNIMUSE datasets. The proposed temporal attention-based TA-AVSUM architecture attains competitive video summarization performances, with strong improvements for human-centric videos compared to the state of the art in terms of F-score, self-defined face recall, and rank correlation metrics.

Similar resources

The use of appropriate MADM model for ranking the vendors of MCI equipments using fuzzy approach

Nowadays, the science of decision making has received more attention due to the complexity of supplier selection problems. One of the efficient tools in economic and human resources development is the extension of communication networks in developing countries, so the proper selection of suppliers of TC equipment is of great concern. In this study, a ...

A user-centric approach for event-driven summarization of surveillance videos

In this paper, a user-centric approach for video summarization is introduced. The method produces meaningful video summaries, by fusing low-level visual information, extracted by processing consecutive frames, with high-level information derived from detected events. The video summaries are presented to the user in the form of most representative frames, while an intuitive user interface allows...

Analysis of reading comprehension needs of the students of paramedical studies: the case of the students of health information management (HIM)

No abstract available.

Affective Systems in Human-Centric Intelligent Environments

The starting point for this research is today's findings on human emotional intelligence, on investigations of the role of affect in interaction design, and on the different cultural approaches to communication. In particular, I want to investigate the role of physical embodiment, such as facial expressions and gestures. In general relationships between body parameters and b...

Contribution of Color Information in Visual Saliency Model for Videos

Much research has been concerned with the contribution of the low-level features of a visual scene to the deployment of visual attention. Bottom-up saliency models have been developed to predict the location of gaze according to these features. So far, color, alongside brightness, contrast and motion, has been considered one of the primary features in computing bottom-up saliency. However, its cont...

Journal

Journal title: IEEE Transactions on Affective Computing

Year: 2022

ISSN: 1949-3045, 2371-9850

DOI: https://doi.org/10.1109/taffc.2022.3222882